Abstract



The present review examines research on the effects of technology use on reading achievement 
in K-12 classrooms. Unlike previous reviews, this review applies consistent inclusion standards 
to focus on studies that met high methodological standards. In addition, methodological and 
substantive features of the studies are investigated to examine the relationship between education 
technology and study features. A total of 85 qualified studies based on over 60,000 K-12 
participants were included in the final analysis. Consistent with previous reviews of similar 
focus, the findings suggest that education technology generally produced a positive, though 
small, effect (ES=+0.16) in comparison to traditional methods. However, the effects may vary
by education technology type. In particular, the types of supplementary computer-assisted 
instruction programs that have dominated the classroom use of education technology in the past 
few decades are not producing educationally meaningful effects in reading for K-12 students. In 
contrast, innovative technology applications and integrated literacy interventions with the 
support of extensive professional development showed somewhat promising evidence. However,
too few randomized studies for these promising approaches are available at this point for firm
conclusions.

Introduction



The classroom use of education technology, such as computers, interactive whiteboards,
multimedia, and the internet, has been growing at a phenomenal rate in many countries in the last
two decades. According to a recent survey conducted by the U.S. Department of Education
(SETDA, 2010) on the use of education technology in US public schools, almost all public
schools had one or more instructional computers with internet access, and the ratio of students to 
instructional computers with internet access was 3.1 to 1. In addition, 97% of schools had one or 
more instructional computers located in classrooms and 58% of schools had laptops on carts. A 
majority of public schools surveyed also indicated their schools provided various education 
technology devices for instruction: LCD (liquid crystal display) and DLP (digital light 
processing) projectors (97%), digital cameras (93%), and interactive whiteboards (73%). The 
U.S. Department of Education provides generous grants to state education agencies to support 
the use of education technology in K-12 classrooms. For example, in fiscal year 2009, the 
Department made a $900 million investment in education technology in elementary and 
secondary schools (SETDA, 2010). 

Though research on the effectiveness of education technology for improving learning 
outcomes is abundant, previous studies suffer from a number of problems typical in educational 
research: small sample size, brief intervention (e.g., Foster, Erickson, Foster, Brinkman, & 
Torgesen, 1994), lack of control group (e.g., Moseley, 1993), and poorly described treatments. 
Perhaps the biggest problem is that many studies claiming to be studies of technology confound 
use of technology with one-to-one tutoring, small-group tutorials, or other teaching strategies 
known to be effective without technology (e.g., Barker & Torgesen, 1995; Ehri, Dreyer, 
Flugman, & Gross, 2007; Torgesen, Wagner, Rashotte, Herron, & Lindamood, 2010; Wentink,
Van Bon, & Schreuder, 1997). The purpose of this review is to examine all studies that meet 
well-justified standards of methodological rigor reported since the 1970s to look at the overall 
effectiveness of education technology for enhancing reading achievement in K-12 classrooms. 

Previous Reviews 

Several major meta-analyses of the impact of education technology on reading have been 
conducted in the past two decades (Becker, 1992; Blok, Oostdam, Otter, & Overmatt, 2002; 
Fletcher-Finn & Gravatt, 1995; C. L. C. Kulik & J. A. Kulik, 1991; J. A. Kulik, 2003; Ouyang, 
1993; Soe, Koki, & Chang, 2000). Overall, all came to a similar conclusion, that education 
technology generally produced small to moderate effects on reading outcomes with effect sizes 
ranging from +0.06 to +0.43. For example, Blok, Oostdam, Otter, & Overmatt (2002)
examined 42 studies from 1990 onward and found an overall effect size of +0.19 in support of 
education technology for K-3 students. Their conclusion was consistent with the findings of 
earlier reviews by Becker (1992), Fletcher-Finn & Gravatt (1995), Kulik & Kulik (1991),
Ouyang (1993), and Soe, Koki, & Chang (2000).

Insert Table 1 here



A more recent review was conducted by Kulik (2003) on the impact of education 
technology on various subjects. For reading, a total of 27 studies focusing on three major 
applications of technology to reading instruction were included: integrated learning systems, 
writing-based reading programs, and reading management programs. Results varied by program 
type. No significant positive effect was found in the nine controlled studies of integrated
learning systems. However, moderate positive effects were found in the 13 writing-based reading
program studies such as Writing to Read, with an overall effect size of +0.41, and in the three
reading management program studies (Accelerated Reader), with an average effect size of +0.43.
Of particular relevance to our review are the two meta-analyses by Kulik & Kulik (1991) and 
Soe, Koki, & Chang (2000), which had a focus on K-12 classrooms. Both reviews found a 
positive but modest effect of education technology on reading performance (ES=+0.25 and 
+0.13, respectively) for K-12 students. 

Probably the most often-cited review in education technology was conducted by Kulik 
and Kulik (1991), who viewed computers as valuable tools for teaching and learning. 
Specifically, they claimed that: 

1. Education technology was capable of producing positive but small effects on student
achievement (ES=+0.30). 

2. Education technology could produce substantial savings in instruction time 
(ES=+0.70). 

3. Education technology fostered positive attitudes toward technology (ES=+0.34). 

4. In general, education technology could be used to help learners become better 
readers, calculators, writers, and problem solvers. 

However, Clark (1983; 1985; 1985; 1994) argued that there was not enough evidence to 
say that educational technology is more effective than other teaching methods. He believed that 
the achievement gains in many of these studies might be due to novelty effects or to the 
instructional strategies used with the computers, but not the media itself. In addition, many 
studies included in these major reviews do not meet minimal standards of methodological 
adequacy. For example, 10 of the 42 studies included in Blok’s review did not include a control 
group. Furthermore, it is quite possible that the positive achievement outcomes from some of 
these so-called technology studies might not be caused by the technology itself, but rather by the 
extended learning time for additional practice. 

The need to re-examine research on the effectiveness of technology for reading outcomes 
has been heightened by the publication of a large-scale, randomized evaluation of modern 
computer-assisted instruction programs by Dynarski et al. (2007) and Campuzano et al. (2009). 
Teachers within schools were randomly assigned to use any of 5 first grade programs and any of 



The Best Evidence Encyclopedia is a free web site created by the Johns Hopkins University School of Education's Center for Data-Driven
Reform in Education (CDDRE) under funding from the Institute of Education Sciences, U.S. Department of Education. 



4 fourth grade programs, or to control groups. At both grade levels and in both years of the 
evaluation, effect sizes were near zero. The overall effect size was +0.04 for first grade and 
+0.02 for fourth grade. The second year evaluation allowed for computation of effect sizes for 
each CAI program separately, and these comparisons found that none of the programs had 
notable success in reading. 

This large-scale, third-party federal evaluation raises troubling questions about the 
effectiveness of technology for elementary reading outcomes. The Dynarski et al. (2007) and 
Campuzano et al. (2009) effect sizes were much lower than the effect sizes reported from all of 
the earlier research reviews. The study’s use of random assignment, a large sample size, and 
careful measurement to evaluate several modern commercial CAI programs, raises important 
questions about the effectiveness of the technology applications that have been most common in 
education for many years. Do the Dynarski/Campuzano findings conform with those of other
high-quality evaluations? Are there technology applications different from the supplemental CAI 
programs studied by Dynarski/Campuzano that have great promise? What can we learn from the 
whole literature on technology applications to inform future research and practice in this critical 
area? 



The present review examines research on the effects of technology use on reading 
achievement in K-12 classrooms. Unlike most previous reviews, this review applies consistent 
inclusion standards to focus on studies that met high methodological standards. In addition, 
methodological and substantive features of the studies are investigated to examine the 
relationship between education technology and study features. 

Method 

The current review employed meta-analytic techniques proposed by Glass, McGaw & 
Smith (1981) and Lipsey & Wilson (2001). Comprehensive Meta-analysis Software Version 2 
(Borenstein, Hedges, Higgins, & Rothstein, 2005) was used to calculate effect sizes and to carry 
out various meta-analytical tests, such as Q statistics and sensitivity analyses. Like many
previous meta-analyses, this study follows several key steps: 1. Locating all possible studies; 2. 
Screening potential studies for inclusion using preset criteria; 3. Coding all qualified studies 
based on their methodological and substantive features; 4. Calculating effect sizes for all 
qualified studies for further combined analyses; and 5. Carrying out comprehensive statistical 
analyses covering both average effects and the relationships between effects and study features. 

Literature Search Procedures 



In an attempt to locate every study that could possibly meet the inclusion criteria, a 
literature search of articles written between 1970 and 2010 was carried out. Electronic searches 
were made of educational databases (e.g., JSTOR, ERIC, EBSCO, PsycINFO, Dissertation
Abstracts), web-based repositories (e.g., Google Scholar), and educational technology
publishers’ websites, using different combinations of key words (e.g., education technology,
instructional technology, computer-assisted instruction, interactive whiteboards, multimedia,
reading interventions, etc.). We also conducted searches by program name. We attempted to
contact producers and developers of educational technology programs to check whether they
knew of studies that we had missed. References from other reviews of educational technology 
programs were further investigated. We also conducted searches of recent tables of contents of 
key journals from 2000 to 2010: Educational Technology and Society, Computers and 
Education, American Educational Research Journal, Reading Research Quarterly, Journal of 
Educational Research, Journal of Adolescent & Adult Literacy, Journal of Educational 
Psychology, and Reading and Writing Quarterly. Citations in the articles from these and other 
current sources were located. 

Criteria for Inclusion 



In order to be included in this review, studies had to meet the following inclusion criteria 
(see Slavin, 2008, for rationales). 

1. The studies evaluated any type of education technology, including computers,
multimedia, interactive whiteboards, and other technology.

2. The studies involved students in grades K-12. 

3. The studies compared students taught in classes using a given technology-assisted 
reading program to those in control classes using an alternative program or standard 
methods. 

4. Studies could have taken place in any country, but the report had to be available in 
English. 

5. Random assignment or matching with appropriate adjustments for any pretest differences 
(e.g., analyses of covariance) had to be used. Studies without control groups, such as pre- 
post comparisons and comparisons to “expected” scores, were excluded. Studies in which 
students selected themselves into treatments (e.g., chose to attend an after-school 
program) or were specially selected into treatments (e.g., gifted or special education 
programs) were excluded unless experimental and control groups were designated after 
selections were made. 

6. Pretest data had to be provided, unless studies used random assignment of at least 30 
units (individuals, classes, or schools) and there were no indications of initial inequality. 
Studies with pretest differences of more than 50% of a standard deviation were excluded 
because, even with analyses of covariance, large pretest differences cannot be adequately 
controlled for as underlying distributions may be fundamentally different (Shadish, Cook, 
& Campbell, 2002). 

7. The dependent measures included quantitative measures of reading performance, such as 
standardized reading measures. Experimenter-made measures were accepted if they were 
comprehensive measures of reading, which would be fair to the control groups, but 
measures of reading objectives inherent to the program (but unlikely to be emphasized in 
control groups) were excluded. Measures of skills that do not require interpretation of 
print, such as phonemic awareness, oral vocabulary, or writing, were excluded. 

8. A minimum study duration of 12 weeks was required. This requirement is intended to 
focus the review on practical programs intended for use for the whole year, rather than 
brief investigations. Brief studies may not allow programs to show their full effect. On 
the other hand, brief studies often advantage experimental groups that focus on a 
particular set of objectives during a limited time period while control groups spread that
topic over a longer period. Studies with brief treatment durations that measured outcomes 
over periods of more than 12 weeks were included, however, on the basis that if a brief 
treatment has lasting effects, it should be of interest to educators. 

9. Studies had to have at least two teachers in each treatment group. 

10. Studied programs should be replicable in realistic school settings. Studies providing 
experimental classes with extraordinary amounts of assistance that could not be provided 
in ordinary applications were excluded. 
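
As an illustration only, the design-related criteria above (5, 6, 8, and 9) can be expressed as a simple screening rule. The record type and every field name below are hypothetical and are not taken from the review's actual coding sheet; the remaining criteria (outcome measures, replicability, language, "no indications of initial inequality") call for reviewer judgment and are not modeled.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class StudyRecord:
    """Hypothetical coding record; field names are illustrative only."""
    has_control_group: bool
    randomized: bool
    matched_with_adjustment: bool
    units_assigned: int              # individuals, classes, or schools
    pretest_gap_sd: Optional[float]  # pretest difference in SD units; None if no pretest
    outcome_span_weeks: float        # weeks from treatment start to outcome measurement
    teachers_per_group: int


def passes_design_screen(s: StudyRecord) -> bool:
    """Apply a subset of the inclusion criteria (5, 6, 8, and 9)."""
    # Criterion 5: a control group with random assignment or adjusted matching.
    if not s.has_control_group:
        return False
    if not (s.randomized or s.matched_with_adjustment):
        return False
    # Criterion 6: pretests required unless at least 30 units were randomized;
    # pretest gaps larger than half a standard deviation are excluded.
    if s.pretest_gap_sd is None:
        if not (s.randomized and s.units_assigned >= 30):
            return False
    elif abs(s.pretest_gap_sd) > 0.5:
        return False
    # Criterion 8: outcomes measured over at least 12 weeks.
    if s.outcome_span_weeks < 12:
        return False
    # Criterion 9: at least two teachers in each treatment group.
    return s.teachers_per_group >= 2
```

A screen of this kind would only flag candidates; as described next, each study was still examined independently by both authors.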

Both the first and second author looked at each potential study independently. When 
disagreement arose, both authors reexamined the studies in question together and came to a final 
agreement. 

Study Coding 



To examine the relationship between effects and studies’ methodological and substantive 
features, studies needed to be coded. Methodological features included research design and 
sample size. Substantive features included grade levels, types of education technology 
programs, program intensity, level of implementation, and socio-economic status. In addition, 
ability, SES, gender, and race were coded for subgroup analyses. 

Effect Size Calculations and Statistical Analyses 

In general, effect sizes were computed as the difference between experimental and 
control individual student posttests after adjustment for pretests and other covariates, divided by 
the unadjusted posttest pooled SD. Procedures described by Lipsey & Wilson (2001) and 
Sedlmeier & Gigerenzer (1989) were used to estimate effect sizes when unadjusted standard
deviations were not available, as when the only standard deviation presented was already 
adjusted for covariates or when only gain score SD’s were available. If pretest and posttest 
means and SD’s were presented but adjusted means were not, effect sizes for pretests were 
subtracted from effect sizes for posttests. After calculating individual effect sizes for all 89 
qualifying studies, Comprehensive Meta-Analysis software was used to carry out all statistical
analyses such as Q statistics and overall effect sizes. 
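
The two calculations described in this paragraph can be sketched as follows. This is a minimal illustration, assuming the conventional degrees-of-freedom-weighted pooled standard deviation; the function and argument names are ours, and the estimation procedures used when unadjusted standard deviations are unavailable are not covered.

```python
import math


def pooled_sd(sd_exp, n_exp, sd_ctrl, n_ctrl):
    """Pooled (unadjusted) standard deviation of the two groups."""
    num = (n_exp - 1) * sd_exp ** 2 + (n_ctrl - 1) * sd_ctrl ** 2
    return math.sqrt(num / (n_exp + n_ctrl - 2))


def effect_size(mean_exp, mean_ctrl, sd_exp, n_exp, sd_ctrl, n_ctrl):
    """Standardized mean difference: (experimental - control) / pooled SD.

    Pass covariate-adjusted means when they are available; the SDs
    should be the unadjusted posttest SDs.
    """
    return (mean_exp - mean_ctrl) / pooled_sd(sd_exp, n_exp, sd_ctrl, n_ctrl)


def pretest_corrected_effect_size(pre, post):
    """Posttest effect size minus pretest effect size.

    Used when adjusted means are not reported. `pre` and `post` are
    dicts of keyword arguments for effect_size().
    """
    return effect_size(**post) - effect_size(**pre)
```

For example, with made-up pretest means of 48 and 49 and posttest means of 53 and 50 (SD of 10 and 50 students per group throughout), the pretest effect size is -0.10, the posttest effect size is +0.30, and the corrected effect size is +0.40.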

Findings 



Overall Effects 

A total of 85 qualified studies based on 60,721 K-12 participants were included in the 
final analysis: 8 kindergarten studies (N=2,068), 59 elementary studies (N=34,200), and 18 
secondary studies (N=24,453). As indicated in Table 2, the overall mean effect size for the 85 
qualified studies is +0.16. The distribution of effect sizes in this collection of studies is highly
heterogeneous (Q=362.52, df=84, p<0.00), indicating that the variance of study effect sizes is
larger than can be explained by simple sampling error. Thus, a random-effects model was used (see footnote 1).
As will be discussed in a later section, some methodological features (e.g. research design, 
sample size) and substantive features (e.g. type of intervention, grade level, SES) were used to 
model some of these variations. 
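
For readers unfamiliar with these quantities, the heterogeneity statistic and a random-effects mean can be sketched as below. The sketch assumes the DerSimonian and Laird method-of-moments estimator of the between-study variance, which is one common choice; the exact routine inside the software used for this review may differ, and the inputs in the usage note are made up.

```python
import math


def random_effects_summary(effects, variances):
    """Return (random-effects mean, its standard error, tau^2, Q).

    effects   : per-study effect sizes
    variances : per-study sampling variances of those effect sizes
    """
    k = len(effects)
    w = [1.0 / v for v in variances]                      # fixed-effect weights
    mean_fe = sum(wi * e for wi, e in zip(w, effects)) / sum(w)
    # Q: weighted sum of squared deviations from the fixed-effect mean.
    q = sum(wi * (e - mean_fe) ** 2 for wi, e in zip(w, effects))
    # Method-of-moments estimate of between-study variance, floored at zero.
    c = sum(w) - sum(wi ** 2 for wi in w) / sum(w)
    tau2 = max(0.0, (q - (k - 1)) / c) if c > 0 else 0.0
    # Random-effects weights add tau^2 to every study's variance,
    # which evens out the weights of small and large studies.
    w_re = [1.0 / (v + tau2) for v in variances]
    mean_re = sum(wi * e for wi, e in zip(w_re, effects)) / sum(w_re)
    se_re = math.sqrt(1.0 / sum(w_re))
    return mean_re, se_re, tau2, q
```

With two hypothetical studies reporting effects of 0.0 and 0.4, each with sampling variance 0.01, the function gives Q = 8, a between-study variance of 0.07, and a random-effects mean of 0.2 with a standard error of 0.2.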



Insert Table 2 here 



Sensitivity Analysis 

A sensitivity analysis was performed to check any possible outliers that may skew the 
overall results. Using a “one-study removal” analysis (Borenstein et al., 2009), we found that
the range of effect sizes still falls within the 95% confidence interval (0.12 to 0.21). In other 
words, the removal of any one effect size does not substantially affect the overall effect sizes. 
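
The logic of a one-study removal analysis can be illustrated with a short sketch. For brevity it pools with plain inverse-variance weights, which is a simplification of the random-effects weighting used in this review; the function names and the inputs in the usage note are illustrative only.

```python
def pooled_mean(effects, variances):
    """Inverse-variance weighted mean effect size."""
    weights = [1.0 / v for v in variances]
    return sum(w * e for w, e in zip(weights, effects)) / sum(weights)


def leave_one_out(effects, variances):
    """Pooled mean recomputed with each study removed in turn.

    Both arguments must be lists of equal length.
    """
    results = []
    for i in range(len(effects)):
        rest_e = effects[:i] + effects[i + 1:]
        rest_v = variances[:i] + variances[i + 1:]
        results.append(pooled_mean(rest_e, rest_v))
    return results


def all_within(values, low, high):
    """True if every recomputed mean lies inside the interval [low, high]."""
    return all(low <= v <= high for v in values)
```

With three made-up effects of 0.1, 0.2, and 0.3 and equal variances, dropping each study in turn gives pooled means of 0.25, 0.20, and 0.15; `all_within` then checks those values against a chosen confidence interval.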

Publication Bias 



Two statistical analyses were performed to check whether there was a significant number 
of studies with null results that have not been uncovered through a search of the literature to 
nullify the effects found in the meta-analysis: Classic fail-safe N and Orwin’s fail-safe N. As 
indicated in Table 3, the classic fail-safe N test determined that a total of 4,198 studies with null 
results would be needed in order to nullify the effect. The Orwin’s test (Table 4) estimates the 
number of missing null studies that would be required to bring the mean effect size to a trivial 
level. We set 0.01 as the trivial value. The result indicated that the number of missing null 
studies to bring the existing overall mean effect size to 0.01 was 880. Taken together, these 
results suggest that there is no reason to believe that publication bias could account for the 
positive effect size. 
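
Both fail-safe statistics have simple closed forms, sketched below from their standard textbook definitions. The function names are ours, the critical value is left as a parameter because it depends on the chosen alpha and number of tails, and nothing here attempts to reproduce the values in Tables 3 and 4, which depend on inputs not given in the text.

```python
def rosenthal_failsafe_n(z_scores, z_crit=1.645):
    """Classic fail-safe N.

    Number of unseen studies averaging z = 0 that would pull the
    combined (Stouffer) z below z_crit: (sum of z)^2 / z_crit^2 - k.
    """
    k = len(z_scores)
    return (sum(z_scores) ** 2) / (z_crit ** 2) - k


def orwin_failsafe_n(k, mean_es, trivial_es, missing_es=0.0):
    """Orwin's fail-safe N.

    Number of missing studies with mean effect `missing_es` needed to
    bring the mean of k studies down from `mean_es` to `trivial_es`.
    """
    return k * (mean_es - trivial_es) / (trivial_es - missing_es)
```

As an arithmetic illustration, 85 studies with a mean effect of +0.11 and a trivial threshold of 0.01 give an Orwin value of about 850. That figure uses the rounded fixed-effects mean from the footnote, so it should not be expected to match the 880 reported above.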



Insert Table 3 & 4 here 



1 A random-effects model was used for three reasons. First, the test of heterogeneity in effect sizes was statistically 
significant. Second, the studies for this review were drawn from populations that are quite different from each 
other, e.g. age of the participants, types of intervention, research design, etc. Third, the random-effects model has 
been widely used in meta-analysis because the model does not discount a small study by giving it a very small 
weight, as is the case in the fixed-effects model (Borenstein, Hedges, Higgins, & Rothstein, 2009; DerSimonian &
Laird, 1986; Schmidt, Oh, & Hayes, 2009). The average effect size using a fixed-effects procedure was only +0.11
(see Table 2).






As an additional test of the possibility of publication bias, we used a mixed-effects model 
to test whether there was a significant difference between published journal articles and 
unpublished publications such as technical reports and dissertations. As indicated in Table 5, 
the overall effect sizes for published articles and unpublished reports are +0.25 and +0.14,
respectively. The Q-value (Q_B = 4.44, df=1, p<0.04) does indicate publication bias in this
collection of studies. In other words, the effect sizes from the published journal articles were 
significantly larger than those in technical reports and dissertations. 
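
The p-value attached to a between-group Q can be checked by hand, because Q is referred to a chi-square distribution and the upper-tail probability has a closed form for one and two degrees of freedom. The helper names below are ours; this is a verification sketch, not the software's own routine.

```python
import math


def chi2_upper_tail_df1(q):
    """P(X > q) for a chi-square variable with 1 degree of freedom."""
    return math.erfc(math.sqrt(q / 2.0))


def chi2_upper_tail_df2(q):
    """P(X > q) for a chi-square variable with 2 degrees of freedom."""
    return math.exp(-q / 2.0)


# Between-group Q of 4.44 with one degree of freedom, as reported above.
p_publication = chi2_upper_tail_df1(4.44)
```

The resulting value is roughly 0.035, consistent with the reported p<0.04.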



Insert Table 5 here 



Year of Publication 



We were also interested in looking at whether there were any differences among studies 
according to their publication year. Earlier reviews found suggestive evidence that effectiveness 
of education technology was improving over time as technology became more sophisticated and 
advanced (Fletcher-Finn & Gravatt, 1995; J. A. Kulik & Kulik, 1987; Niemiec & Walberg, 
1987). For example, Niemiec and Walberg (1987) found that the average effect size for 
microcomputer-based instruction (ES=+1.12) was three times larger than that of computer-based 
instruction delivered through mainframes (ES=+0.38). Kulik & Kulik (1987) also detected a 
similar pattern. In their meta-analyses on computer-based instruction, they found that the 
average effect for studies from 1966-1974 was +0.24 whereas studies from 1974 to 1984 had a 
larger overall effect size of +0.36. Fletcher-Finn & Gravatt (1995) reported that the average 
effect size for computer-assisted instruction was +0.24 for the years 1987-1992, but the effect 
size for more recent studies was +0.33. However, the present review found no trend toward 
more positive results in recent years (see Table 6). Means for each time period were close to the 
overall mean effect size of +0.16.



Insert Table 6 here 



Methodology Features 

As indicated in Table 2, the value of the Q statistic suggests that there is considerable 
variation in effect sizes across studies. In order to understand possible reasons for such 
variation, we examined two key potential methodological features that may help explain some of 
the variation: research design and sample size. 

Research Design. One potential source of variation is the presence of different research 
designs (e.g., Abrami & Bernard, 2006). Four categories of research design were identified in 
this collection of studies. Randomized experiments (N=25) were those in which students,
classes, or schools were randomly assigned to conditions and the unit of analyses was at the level 
of the random assignment. Randomized quasi-experiments (RQE) (N=3) refer to studies that 
used random assignment at the school or class level but the analysis was done at the student level 
due to too few schools or classes. Matched control (N=48) studies were ones in which 
experimental and control groups were matched on key variables at pretest, before posttests were 
known, while matched post-hoc studies (MPH) (N=9) were ones in which groups were matched 
retrospectively, after posttests were known. Table 7 presents the outcomes of the analyses 
according to research designs. The average effect size for randomized experimental studies, 
randomized quasi experiments, matched control studies, and matched post hoc studies were 
+0.08, +0.16, +0.19, and +0.19, respectively. Since there were only three RQE studies and the 
effect sizes of the matched and MPH studies were similar, we decided to combine these three 
quasi-experimental categories into one category and compared it to randomized experiments. 
Results are shown in Table 8. The mean effect size for quasi-experimental studies was +0.19, 
twice the size of that for randomized studies. As a group, randomized evaluations had effect 
sizes like those reported in the Dynarski/Campuzano study, while quasi-experiments had higher 
estimates. 



Insert Table 7 and 8 here 



Sample Size. Another potential source of variations may lie in differences in study sample 
size. Previous studies suggest that studies with small sample sizes produce larger effect sizes 
than do large studies (Liao, 1999; Slavin & Smith, 2009). In this collection of studies, there 
were a total of 49 large studies with sample sizes greater than 250 and 36 small studies with 
fewer than 250 students. As indicated in Table 9, a statistically significant difference was found 
between large studies and small studies (Q_B = 4.66, df=1, p<0.03). The mean effect size for
the 36 small studies (ES=+0.25) was about twice that of the large studies (ES=+0.13).
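
The between-group statistic used in moderator comparisons of this kind can be sketched from subgroup summaries alone. The function below takes each subgroup's pooled mean and standard error; the standard errors in the usage note are hypothetical, since the text does not report them, so the result illustrates the formula and does not reproduce Table 9.

```python
def q_between(means, std_errors):
    """Between-group Q from subgroup pooled means and their standard errors.

    Each subgroup is weighted by the inverse of its squared standard
    error; Q_B is the weighted sum of squared deviations of the subgroup
    means from their weighted grand mean, with df = groups - 1.
    """
    weights = [1.0 / se ** 2 for se in std_errors]
    grand = sum(w * m for w, m in zip(weights, means)) / sum(weights)
    return sum(w * (m - grand) ** 2 for w, m in zip(weights, means))
```

With subgroup means of +0.25 and +0.13 and assumed standard errors of 0.05 and 0.03, Q_B is about 4.24. For two groups the statistic reduces to the squared difference in means divided by the sum of the squared standard errors.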



Insert Table 9 here 



Design/Size. After examining the effect of research design and sample sizes separately, 
we then looked at the combined effect of these two moderator variables together. As shown in 
Table 10, the difference among the four groups was significant (Q_M = 12.37, p<0.00). Small
matched control studies produced the largest effect size (ES=+0.24), followed by small 
randomized studies (ES=+0.21), large matched control studies (ES=+0.16), and large 
randomized studies (ES=+0.07). Within each research design, the effect sizes of small studies 
were about twice as large as those of large studies. The findings for the large randomized
studies, as a group, resembled those of the Dynarski/Campuzano study. 



Insert Table 10 here 



Substantive Features 



In addition to methodological features, substantive features were also examined to help 
explain some of the variation in the model. Five key substantive features were identified and 
examined: Grade levels, types of intervention, program intensity, level of implementation, and 
socio-economic status. 

Grade Levels. Studies were organized in three grade levels: Kindergarten (N=8), 
Elementary (N=59), and Secondary (N=18). The results by grade levels are shown in Table 11. 
The effect sizes for kindergarten, elementary, and the secondary level were +0.15, +0.10, and 
+0.31, respectively. The between-group difference (Q_B = 9.52, df=2, p<0.01) was significant.
The post hoc test suggests that the effect size at the secondary level was significantly higher than 
that at the kindergarten and elementary levels. 

Types of intervention. In terms of intervention type, the studies were divided into four 
major categories: Computer-Managed Learning (CML) (N=4), Innovative Technology 
Applications (ITA) (N=6), Comprehensive models (N=18), and Supplemental Technology
(N=57). The majority of the studies (67%) fell into the supplementary program category. These 
supplementary programs, such as Destination Reading, Plato Focus, Waterford, and WICAT, 
provide additional instruction at students’ assessed levels of need to supplement traditional 
classroom instruction. These were the types of programs evaluated in the Dynarski/Campuzano 
evaluation. Innovative Technology Applications included Fast ForWord, Reading Reels, and
Lightspan. Computer-Managed Learning Systems included only Accelerated Reader. This 
program uses computers to assess students’ reading levels, assigning reading materials at 
students’ levels, scoring tests on those readings, and charting students’ progress, but students do 
not work directly on the computer. Comprehensive models, represented by READ 180, Writing 
to Read, and Voyager Passport, are methods that use computer-assisted instruction along with 
non-computer activities as students’ core reading approach. 



Insert Table 11 here



Table 12 presents the summary results of the analyses by program types. A marginally
significant between-group effect (Q_B = 7.15, df=3, p<0.07) was found, indicating some variations
among the four programs. The 18 comprehensive model studies produced the largest effect size, 
+0.28, and the four computer managed learning and the six innovative technology applications 
produced similar moderate effect sizes of +0.19 and +0.18, respectively. The average effect size 
for the 57 supplemental technology programs was only +0.11. The results of the analyses of
CML and ITA data have to be considered carefully, however, due to the small number of studies 
in these categories. 



Insert Table 12 here 



Program intensity. Program intensity may help explain some of the variation in the 
model. Program intensity was divided into two categories: low intensity (the use of technology 
less than 15 minutes a day or less than 75 minutes a week) and high intensity (over 15 minutes a 
day or 75 minutes a week). Analyzing the use of technology as a moderator variable, no 
significant difference was found between the two intensity categories (Q_B = 3.04, df=1, p=0.08).
This result suggests that more technology use does not necessarily result in better outcomes. The
effect sizes for low and high intensity are +0.11 and +0.19, respectively.



Insert Table 13 here 



Level of Implementation. Significant differences were found among low, medium, and 
high levels of implementation as reported by the researchers. The mean effect sizes for low, 
medium, and high implementation were +0.01, +0.18, and +0.22, respectively. Over half of the
studies (53%) did not provide sufficient information about implementation. It is clear from the 
findings that no effect was found when implementation was described as low. A significant and 
positive effect was detected for groups that had a medium or high level of implementation rating. 
The implementation ratings must be considered cautiously, however, because authors who knew 
that there were no experimental-control differences may have described poor implementation as 
the reason, while those with positive effects might be less likely to describe implementation as 
poor. For example, Patterson et al. (2003) did not find significant differences between the
treatment and control groups for their Waterford study and they concluded that “it could be 
argued that the Waterford failed to produce promised results because the teachers did not 
implement it appropriately or that differences in use among the eight classrooms contributed to 
better results for some than for others” (p. 200). 

Insert Table 14 here



Socio-economic status (SES). Studies were divided into three categories: Low, mixed, 
and high SES. Low SES refers to studies that had 40% or more students receiving free and 
reduced-price lunch and high SES less than 40%. Four studies that involved a diverse 
population, including both low and high SES students, were excluded in these analyses. The p- 
value (0.31) of the test of heterogeneity in effect sizes suggests that the variance in the sample of
effect sizes was within the range that could be expected based on sampling error alone. The
effect sizes for low and high SES were +0.17 and +0.12, respectively, indicating a minimal effect 
of SES (Table 15). In addition to the between-study comparison, we also looked at the 
differential impact of instructional technology on students with different SES background within 
studies. There were a total of ten studies identified. As shown in Table 16, education 
technology had a slightly higher positive impact on low SES students with an average effect of 
+0.31, whereas the effect for high SES students was +0.20. Due to low power, no significant
difference was found between low SES and high SES groups. 



Insert Table 15 and 16 here 



Within-Study Subgroup Analyses 

Besides looking at methodological and substantive features, subgroup analyses of 
comparisons within studies were also conducted to compute differential mean effect sizes based 
on student demographic characteristics such as student ability, gender, race, and language. 

Because the number of studies in these subgroup analyses was small, it is difficult to estimate the 
between-studies variance (Tau Square) with any precision. Thus the fixed-effects model was 
used. Interpretation of some of these results also needs to be tentative due to the small number 
of studies involved. These initial findings need to be verified with additional studies. 
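
To make the modelling choice above concrete, here is a minimal sketch of a fixed-effects (inverse-variance) summary together with the DerSimonian-Laird estimate of the between-studies variance (Tau Square). It is not the software used for this review; the function names and the two-study input are hypothetical and serve only to show the arithmetic. With very few studies, Q has few degrees of freedom, so the Tau Square estimate derived from it is unstable, which is the reason given above for preferring the fixed-effects model in these subgroup analyses.

```python
import math


def fixed_effect_summary(effects, variances):
    """Inverse-variance weighted mean, its standard error, and Cochran's Q."""
    weights = [1.0 / v for v in variances]
    w_sum = sum(weights)
    mean = sum(w * d for w, d in zip(weights, effects)) / w_sum
    se = math.sqrt(1.0 / w_sum)
    q = sum(w * (d - mean) ** 2 for w, d in zip(weights, effects))
    return mean, se, q


def dersimonian_laird_tau2(effects, variances):
    """Method-of-moments estimate of the between-studies variance."""
    if len(effects) < 2:
        raise ValueError("at least two studies are needed to estimate tau^2")
    weights = [1.0 / v for v in variances]
    w_sum = sum(weights)
    _, _, q = fixed_effect_summary(effects, variances)
    c = w_sum - sum(w * w for w in weights) / w_sum
    df = len(effects) - 1
    # Negative estimates are truncated at zero.
    return max(0.0, (q - df) / c)


if __name__ == "__main__":
    # Hypothetical effect sizes and sampling variances for two studies.
    effects = [0.10, 0.30]
    variances = [0.01, 0.01]
    mean, se, q = fixed_effect_summary(effects, variances)
    tau2 = dersimonian_laird_tau2(effects, variances)
    print(f"mean = {mean:.3f}, SE = {se:.3f}, Q = {q:.2f}, tau^2 = {tau2:.3f}")
```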

Ability. Out of the 85 qualifying studies, there were a total of 13 studies that examined the 
impact of instructional technology on students with different academic abilities, yielding 29 effect 
sizes. The mean effect sizes for low, middle, and high ability students were +0.37, +0.27, and 
+0.08, respectively. The post hoc tests suggest that instructional technology had a more positive 
impact on low and middle ability students than it did on high ability students. 
Insert Table 17 here 



Gender. As indicated in Table 18, instructional technology generated a more positive 
impact among males than females. The effect sizes for males and females were +0.28 and +0.12, 
respectively. The difference was not statistically significant, however, possibly because of low 
power. 



Insert Table 18 here 



Race. A total of seven studies examined the interaction effect of race with the use of 
education technology. The mean effect sizes for students who were African American, Hispanic, 
and White were +0.12, +0.42, and +0.11, respectively. The number of studies in each group was 
small, however, and there was only one study on a Hispanic population. 



Insert Table 19 here 



English Language Learners. Only three studies examined the effect of instructional 
technology on English language learners. The effect size was +0.29 (p<0.05). 



Insert Table 20 here 



Discussion 



The purpose of this review was to examine the overall effectiveness of education 
technology on reading outcomes in K-12 classrooms. Important methodological and substantive 
moderator variables, such as research design, sample size, type of intervention, and program 
intensity were used to examine whether outcomes were different according to these study 
features. Furthermore, sub-analyses were conducted to look at the differential impact on key 
subgroups such as gender, race, and SES. 
Consistent with previous reviews of similar focus, the findings of this study suggest that 
education technology generally produced a positive, though small, effect (ES=+0.16) in 
comparison to traditional methods. This effect is much larger than those reported in the recent 
large, randomized evaluation of current commercial CAI models by Dynarski et al. (2007) and 
Campuzano et al. (2009). Yet to the degree other studies have resembled aspects of 
Dynarski/Campuzano, the outcomes have also been more similar. In particular, studies of 
traditional, supplementary CAI, studies that used random assignment, and studies with large 
sample sizes (all of which are characteristics of the Dynarski/Campuzano studies) found smaller 
effect sizes than other studies. 

Qualifying studies provide greater support for technology applications other than 
supplementary CAI, which had an overall effect size of +0.11. Out of the 57 qualifying 
supplemental instructional technology studies, 19 were rigorous randomized 
experiments (e.g., Alifrangis, 1991; Becker, 1994; Campuzano et al., 2009; Vaughan, Serido, & 
Wilhelm, 2006), involving a total of approximately 11,000 students. The majority of these 
qualifying studies (53%) were conducted since 2000. Only one study was conducted in the 1970s, 
12 in the 1980s, and 13 in the 1990s. We found no trend toward more positive effects in more 
recent studies. The studies by Dynarski et al. (2007) and Campuzano et al. (2009) evaluated a 
total of six supplemental programs, including Destination Reading, Headsprout, Plato Focus, 
Waterford Early Reading Program, Academy of Reading, and LeapTrack, and found minimal effects 
of these supplemental programs, with effect sizes ranging from -0.01 to +0.11. The evidence from 
these high-quality randomized studies with large samples clearly suggests that the types of 
supplementary computer-assisted instruction programs that have dominated the classroom use of 
education technology in the past few decades are not producing educationally meaningful effects 
in reading for K-12 students. 

In contrast to studies of supplementary CAI, the largest effects were found in the 18 
studies of comprehensive models, including READ 180, Writing to Read, and Voyager Passport, 
with an overall effect size of +0.28. Unlike supplemental computer-assisted instruction models, 
READ 180 and Voyager Passport, the two widely used secondary reading approaches, are 
intended to serve as integrated literacy interventions, which combine computer and 
non-computer instruction in their classrooms, with the support of extensive professional 
development. For example, in READ 180, a widely used secondary model for struggling readers, 
classrooms are provided with 90 minutes a day of instruction in a group of 15. Each period 
begins with a 20-minute shared reading and skills lesson, and then students in groups of 5 rotate 
among three activities: computer-assisted instructional reading, modeled or independent reading, 
and small-group instruction with the teacher. Teachers are given materials and professional 
development to support instruction in reading strategies, comprehension, word study, and 
vocabulary. These comprehensive approaches have a much greater impact on reading instruction 
and on reading outcomes than the ordinary CAI models, but studies of them do not isolate the 
unique contribution made by the use of technology. Further, none of the studies conducted to 
date for READ 180 and Voyager Passport were randomized, and our findings suggest that 
non-randomized studies of technology applications overstate effect sizes. In short, too few 
randomized studies for comprehensive approaches are available at this point for firm 
conclusions. Researchers and developers need to examine the effect of these promising programs 
by using rigorous experimental designs. 

Other technology applications may also have greater promise than supplementary CAI, 
but again, the number of studies of each is small. A single matched evaluation of Lightspan 
(Birch, 2002), which integrates video and computer content on Sony PlayStations as used at 
school and at home, found substantial positive effects (ES=+0.42), but this was a matched 
evaluation involving only two schools. Reading Reels, a program that adds multimedia content to 
the Success for All whole-school reform model, was found in two randomized experiments to 
add significantly to the reading outcomes of Success for All, with effect sizes of +0.17 
(Chambers et al., 2006) and +0.27 (Chambers et al., 2008). 

In addition to these overall findings, several key findings emerging from this review 
warrant mention. First, the majority of the qualifying studies (71%) included in this review were 
quasi-experiments, including matched control, randomized quasi-experiments, and matched 
post-hoc experiments. Out of the 85 qualified studies, only 25 (29%) were randomized experiments. 
Eight of the 25 randomized studies were conducted by Dynarski et al. and Campuzano et al. 
in 2007 and 2009, respectively. The present findings point to an urgent need for more practical 
randomized studies in the area of education technology. 

Second, our findings indicate that studies with small sample sizes generally produced 
about twice the effect sizes of those with large sample sizes. The results support the findings of 
other research studies (Pearson, Ferdig, Blomeyer, & Moran, 2005; Slavin & Smith, 2009). This 
should come as no surprise for three reasons. First, it is much easier for researchers to maintain 
high implementation fidelity in small-scale studies than in large-scale studies. Second, 
standardized tests, which are usually less sensitive to treatments, were more likely to be used in 
large-scale studies. Finally, small studies with null effects may never have been written 
up or made available in published or report form. 

Third, in contrast to previous reviews (e.g., Kulik & Kulik, 1991), we found a significant 
difference between experimental and quasi-experimental designs. Our findings suggest that the 
effect sizes were generally twice as large in quasi-experiments as in true experiments. 

Fourth, a differential impact of education technology at different grade levels was found. 
The use of education technology had a larger impact at the secondary level than at the other 
grade levels, with a mean effect size of +0.31. However, the results need to be interpreted with 
caution. First, only two of the 18 qualified secondary studies were randomized 
experiments. As mentioned earlier, the effects were likely to be larger in quasi-experiments. In 
addition, the 18 qualified secondary studies were dominated by two intervention programs: three 
from Accelerated Reader, and eight from READ 180. The findings suggest that randomized 
studies are particularly needed at the secondary level. 

Fifth, no significant differences were found regarding program intensity. More 
technology does not necessarily result in better outcomes. Future studies may want to 
investigate the impact of instructional time in depth for various grades. 
Finally, it appears that the use of education technology had somewhat greater benefits for 
low ability and ELL students. Given the current focus on intervention for low performing and 
ELL students, schools and districts may consider adopting appropriate proven education 
technology programs in order to close the language and ability gaps, especially in reading. 
However, there are few studies that compare outcomes by ability or ELL status. Further studies 
on these subgroups are needed in order to improve internal and external validity of these 
findings. 

Conclusions 



The findings of this review support those of earlier reviews by other researchers. The 
classroom use of education technology will undoubtedly continue to expand and play an 
increasingly significant role in public education in the years to come as technology becomes 
more sophisticated and more cost-effective. This review highlights the need for more 
randomized studies. In addition, schools and districts should make concerted efforts to identify 
and adopt research-proven education technology programs to improve student academic 
achievement as well as to close the ability and language gaps in their schools. The technology 
approaches most widely used in schools, especially supplemental computer-assisted instruction, 
have the least evidence of effectiveness. Alternative uses of technology have greater promise. 
The U.S. Department of Education should continue to invest in evaluation of innovative 
programs and in creation of new technology. For example, interactive whiteboards have become 
increasingly popular in US public schools. Yet there is little experimental research on their 
outcomes or on effective ways of using these and other whole-class technologies. 

Limitations 



It is important to mention several limitations in this review. First, due to the scope of this 
review, only studies with quantitative measures of reading were included. There is much to be 
learned from other non-experimental studies such as qualitative and correlational research that 
can add depth and insight to understanding the effects of these education technology programs. 
Second, the review focuses on replicable programs used in realistic school settings over periods 
of at least 12 weeks, but it does not attend to shorter, more theoretically-driven studies that may 
also provide useful information, especially to researchers. Finally, the review focuses on 
traditional measures of reading performance, primarily standardized tests. These are useful in 
assessing the practical outcomes of various programs and are fair to control as well as 
experimental teachers, who are equally likely to be trying to help their students do well on these 
assessments. However, the review does not report on experimenter-made measures of content 
taught in the experimental group but not the control group, although results on such measures 
may also be of importance to researchers or educators. 
Table 1: Summary of major meta-analyses in education technology 

Reviews                           Grade    Number of Studies    Effect Sizes
Kulik & Kulik (1991)              K-12            18            +0.25
Becker (1992)                     K-8             10            +0.18
Ouyang (1993)                     K-6             20            +0.16
Fletcher-Finn & Gravatt (1995)    K-12            23            +0.12
Soe, Koki, & Chang (2000)         K-12            17            +0.13
Blok et al (2002)                 K-3             42            +0.19
Kulik (2003)                      K-6             24            +0.06 to +0.43
Table 2 
Overall Effect Sizes 

                                                                95% CI      Test of mean    Test of heterogeneity
Model                        k     ES     SE  Variance   Lower   Upper  Z-value  P-value  Q-value  df(Q)  P-value
1. Fixed                    85   0.11   0.01     0.000    0.09    0.13    12.33     0.00   362.53     84    0.000
2. Random                   85   0.16   0.02     0.000    0.12    0.21     7.51     0.00







Table 3: Classic fail-safe N 

Z-value for observed studies: 13.83 
P-value for observed studies: 0.00 
Alpha: 0.05 
Tails: 2.00 
Z for alpha: 1.96 
Number of observed studies: 85.00 
Number of missing studies that would bring p-value to > alpha: 4198.00 



Table 4: Orwin's fail-safe N 

Standardized difference in means in observed studies: 0.11 
Criterion for a 'trivial' standardized difference in means: 0.01 
Mean standardized difference in means in missing studies: 0.00 
Number of missing studies needed to bring standardized difference in means under 0.01: 880.00 
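
The Orwin figure in Table 4 can be approximated from the three inputs listed there. The sketch below assumes the usual form of Orwin's formula, N = k (d_observed - d_criterion) / (d_criterion - d_missing); the function name orwin_failsafe_n is ours and is not taken from the original analysis. With the rounded mean of 0.11 the formula gives about 850 rather than the reported 880. A gap of that size is what one would expect if the software worked from an unrounded mean slightly above 0.11, but that is an inference on our part and is not stated in the source.

```python
def orwin_failsafe_n(k, d_observed, d_criterion, d_missing=0.0):
    """Number of missing studies needed to pull the mean effect down to d_criterion.

    k           : number of observed studies
    d_observed  : mean standardized difference in the observed studies
    d_criterion : value regarded as a 'trivial' standardized difference
    d_missing   : assumed mean effect in the missing studies
    """
    if d_criterion <= d_missing:
        raise ValueError("d_criterion must exceed d_missing")
    return k * (d_observed - d_criterion) / (d_criterion - d_missing)


if __name__ == "__main__":
    # Inputs as printed in Table 4 (mean effect rounded to two decimals).
    n_missing = orwin_failsafe_n(k=85, d_observed=0.11, d_criterion=0.01)
    print(f"Orwin fail-safe N from rounded inputs: {n_missing:.0f}")
```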
TABLE 5 
By Publication 

Mixed effects analysis 
                                                                95% CI      Test of mean
Publication                  k     ES     SE  Variance   Lower   Upper  Z-value  P-value
1. Published                21   0.25   0.05     0.002    0.16    0.35     5.20     0.00
2. Unpublished              64   0.14   0.02     0.001    0.09    0.18     5.80     0.00
Test of heterogeneity in effect sizes, total between (QB): Q-value = 4.44, df(Q) = 1, P-value = 0.04 



TABLE 6 
By Year of Publication 

Mixed effects analysis 
                                                                95% CI      Test of mean
Year of publication          k     ES     SE  Variance   Lower   Upper  Z-value  P-value
1. 1970s                     1   0.14   0.16      0.03   -0.17    0.45     0.89     0.37
2. 1980s                    15   0.16   0.05     0.002    0.07    0.24     3.70     0.00
3. 1990s                    15   0.08   0.02     0.000   0.041    0.11     4.27    0.000
4. 2000s                    48   0.18   0.03     0.001   0.119    0.25     5.68    0.000
5. 2010s                     6   0.17   0.05     0.003   0.068    0.27     3.28    0.001
Test of heterogeneity in effect sizes, total between (QB): Q-value = 11.14, df(Q) = 4, P-value = 0.03 
TABLE 7 
By Design 

Mixed effects analysis 
                                                                95% CI      Test of mean
Research design              k     ES     SE  Variance   Lower   Upper  Z-value  P-value
1. Randomized               25   0.08   0.02     0.001    0.04    0.13     3.70     0.00
2. RQE                       3   0.16   0.12     0.014   -0.08    0.39     1.31     0.19
3. Matched                  48   0.19   0.04     0.001    0.12    0.26     5.44     0.00
4. MPH                       9   0.19   0.06     0.004    0.06    0.31     2.93     0.00
Test of heterogeneity in effect sizes, total between (QB): Q-value = 7.88, df(Q) = 3, P-value = 0.05 

*MPH = matched post hoc; RQE = randomized quasi-experiment 



TABLE 8 
By Design 

Mixed effects analysis 
                                                                95% CI      Test of mean
Research design              k     ES     SE  Variance   Lower   Upper  Z-value  P-value
1. Randomized               25   0.08   0.02     0.001    0.04    0.13     3.70    0.000
2. Quasi-Experiments        60   0.19   0.03     0.001    0.13    0.25     6.63    0.000
Test of heterogeneity in effect sizes, total between (QB): Q-value = 8.42, df(Q) = 1, P-value = 0.00 
TABLE 9 
By Sample Size 

Mixed effects analysis 
                                                                95% CI      Test of mean
Sample size                  k     ES     SE  Variance   Lower   Upper  Z-value  P-value
1. Large                    49   0.13   0.02     0.001    0.08    0.18     5.42    0.000
2. Small                    36   0.25   0.05     0.002    0.15    0.34     5.35    0.000
Test of heterogeneity in effect sizes, total between (QB): Q-value = 4.66, df(Q) = 1, P-value = 0.03 



TABLE 10 
By Design and Size 

Mixed effects analysis 
                                                                95% CI      Test of mean
Research design and size     k     ES     SE  Variance   Lower   Upper  Z-value  P-value
1. Large Randomized         17   0.07   0.02     0.001    0.03    0.12     3.06     0.00
2. Small Randomized          7   0.21   0.07     0.005    0.06    0.35     2.77     0.00
3. Large Matched Control    31   0.16   0.04     0.001    0.08    0.23     4.14     0.00
4. Small Matched Control    30   0.24   0.05     0.002    0.14    0.33     4.97     0.00
Test of heterogeneity in effect sizes, total between (QB): Q-value = 12.31, df(Q) = 3, P-value = 0.00 
TABLE 11 
By Grade Levels 

Mixed effects analysis 
                                                                95% CI      Test of mean
Grade                        k     ES     SE  Variance   Lower   Upper  Z-value  P-value
1. Kindergarten              8   0.15   0.14     0.019   -0.12    0.42     1.07     0.28
2. Elementary               59   0.10   0.02     0.000    0.07    0.14     6.34     0.00
3. Secondary                18   0.31   0.07     0.004    0.18    0.44     4.77     0.00
Test of heterogeneity in effect sizes, total between (QB): Q-value = 9.52, df(Q) = 2, P-value = 0.01 


TABLE 12 
By Programs 

Mixed effects analysis 
                                                                              95% CI      Test of mean
Types of program                           k     ES     SE  Variance   Lower   Upper  Z-value  P-value
1. Computer-Managed Learning               4   0.19   0.09     0.008    0.02    0.36     2.14     0.03
2. Innovative Technology Applications      6   0.18   0.05     0.003    0.08    0.28     3.51     0.00
3. Comprehensive                          18   0.28   0.07     0.005    0.14    0.41     4.06     0.00
4. Supplemental                           57   0.11   0.02     0.000    0.07    0.15     5.22     0.00
Test of heterogeneity in effect sizes, total between (QB): Q-value = 7.15, df(Q) = 3, P-value = 0.07 
TABLE 13 
By Intensity 

Mixed effects analysis 
                                                                95% CI      Test of mean
Intensity                    k     ES     SE  Variance   Lower   Upper  Z-value  P-value
1. High (>75 min a week)    55   0.19   0.03     0.001    0.13    0.24     6.31     0.00
2. Low (<75 min a week)     30   0.11   0.03     0.001    0.06    0.17     3.99     0.00
Test of heterogeneity in effect sizes, total between (QB): Q-value = 3.04, df(Q) = 1, P-value = 0.08 

Low = less than 75 minutes a week; High = more than 75 minutes a week 



TABLE 14 
By Implementation 

Mixed effects analysis 
                                                                95% CI      Test of mean
Implementation               k     ES     SE  Variance   Lower   Upper  Z-value  P-value
1. Low                       6   0.01   0.03     0.001   -0.06    0.07     0.27     0.79
2. Medium                   17   0.18   0.04     0.001    0.11    0.24     4.99     0.00
3. High                     17   0.22   0.07     0.005    0.09    0.35     3.19     0.00
4. NA                       45   0.16   0.03     0.001    0.10    0.22     5.34     0.00
Test of heterogeneity in effect sizes, total between (QB): Q-value = 17.30, df(Q) = 3, P-value = 0.00 

NA: no information about implementation 
TABLE 15 
By SES (Between Studies) 

Mixed effects analysis 
                                                                95% CI      Test of mean
SES                          k     ES     SE  Variance   Lower   Upper  Z-value  P-value
1. Low SES                  67   0.17   0.03     0.001    0.12    0.22     6.68     0.00
2. High SES                 14   0.12   0.05     0.002    0.03    0.21     2.50     0.01
Test of heterogeneity in effect sizes, total between (QB): Q-value = 1.02, df(Q) = 2, P-value = 0.31 



TABLE 16 
By SES (Within Studies) 

Fixed effects analysis 
                                                                95% CI      Test of mean    Test of heterogeneity
SES                          k     ES     SE  Variance   Lower   Upper  Z-value  P-value  Q-value  df(Q)  P-value
1. Low SES                   6   0.31   0.08      0.00    0.16    0.47     3.94     0.00    32.12      5     0.00
2. High SES                  4   0.20   0.11      0.01   -0.00    0.41     1.95     0.05    16.15      3     0.00
Overall (QT)                10   0.27   0.06      0.00    0.15    0.40     4.32     0.00    48.95      9     0.00
Total within: Q-value = 48.27, df(Q) = 8, P-value = 0.00 
Total between (QB): Q-value = 0.68, df(Q) = 1, P-value = 0.41 
TABLE 17 
Ability 

Mixed effects analysis 
                                                                95% CI      Test of mean
Ability                      k     ES     SE  Variance   Lower   Upper  Z-value  P-value
1. Low                      12   0.37   0.11      0.01    0.15    0.58     3.33     0.00
2. Middle                    8   0.27   0.08      0.01    0.10    0.43     3.26     0.00
3. High                      9   0.08   0.07      0.01   -0.05    0.22     1.19     0.24
Test of heterogeneity in effect sizes, total between (QB): Q-value = 5.85, df(Q) = 2, P-value = 0.05 



TABLE 18 
Gender 

Mixed effects analysis 
                                                                95% CI      Test of mean
Gender                       k     ES     SE  Variance   Lower   Upper  Z-value  P-value
1. Males                    10   0.28   0.11      0.01    0.06    0.49     2.50     0.01
2. Females                  10   0.12   0.08      0.01   -0.03    0.27     1.56     0.12
Test of heterogeneity in effect sizes, total between (QB): Q-value = 1.34, df(Q) = 1, P-value = 0.25 
TABLE 19 
Race 

Fixed effects analysis 
                                                                95% CI      Test of mean    Test of heterogeneity
Race                         k     ES     SE  Variance   Lower   Upper  Z-value  P-value  Q-value  df(Q)  P-value
1. African American          4   0.12   0.03      0.00    0.05    0.18     3.57     0.00    26.06      3     0.00
2. Hispanics                 1   0.42   0.28      0.08   -0.12    0.96     1.51     0.13     0.00      0     1.00
3. White                     4   0.11   0.05      0.00    0.02    0.20     2.32     0.02    12.89      3     0.00
Overall (QT)                 9   0.11   0.03      0.00    0.07    0.17     4.42     0.00    40.16      8     0.04
Total within: Q-value = 38.98, df(Q) = 6, P-value = 0.00 
Total between (QB): Q-value = 1.22, df(Q) = 2, P-value = 0.55 



TABLE 20 
English Language Learners 

Fixed effects analysis 
                                                                95% CI      Test of mean    Test of heterogeneity
English language learners    k     ES     SE  Variance   Lower   Upper  Z-value  P-value  Q-value  df(Q)  P-value
ELL                          3   0.29   0.05      0.00    0.20    0.38     6.27     0.00     0.05      2    0.975
Overall (QT)                 3   0.29   0.05      0.00    0.20    0.38     6.27     0.00     0.05      2    0.975
Total within: Q-value = 0.05, df(Q) = 2, P-value = 0.975 
Total between (QB): Q-value = 0.00, df(Q) = 0, P-value = 1.00 
Study | Design (Large/Small) | Duration | N | Grade | Sample Characteristics | Posttest | Program Intensity | Overall ES 

KINDERGARTEN 

Comprehensive Models 

Writing to Read 
Stevenson et al. (1988) | Matched (S) | 1 year | 241 students (86E, 155C) | K | African American students in Washington, DC | MAT Reading | 15-min daily | +0.35 
Granick & Reid (1987) | Matched (S) | 1 year | 2 schools, 73 students (37E, 36C) | K | High-poverty African American schools in Baltimore | MAT | 15-min daily | +0.02 

Voyager Universal Literacy System 
Frechtling et al. (2006) | Matched (L) | 1 year | 8 schools, 398 students (202E, 196C) | K | High-poverty African American inner city schools | DIBELS, CTOPP, Woodcock | portion of a daily 2-hr instructional block | +0.62 
Hecht (2003) | Matched (S) | 5 months | 4 schools (101E, 112C) | K | High-poverty African American schools | Woodcock | portion of a daily 2-hr instructional block | +0.06 

Supplemental CAI Programs 

Waterford Early Reading Program 
Paterson et al. (2003) | Matched (L) | 1 year | 16 classes (8E, 8C) (49E, 59C) | K | High-poverty community in western New York | Clay Word Recognition Test | 15-min daily | 0.00 
Tracey & Young (2006) | Matched (L) | 1 year | 15 classes (8E, 7C), 265 children (151E, 114C) | K | High-minority northeastern community | TERA-2 | 15-min daily | +0.47 

The Literacy Center (LeapFrog) 
RMC (2004) | Randomized Quasi-Experiment (L) | 1 year | 6 schools, 258 students (126E, 132C) | K | High-poverty schools in Las Vegas, 30% ELL | Gates MacGinitie, DIBELS | 20-30 min daily | +0.14 

Destination Reading 
Barnett (2006) | Matched (L) | 1 year | 15 classes (8E, 7C) | K | High-poverty high-minority community in FL | DIBELS, Clay Word Recognition, Dolch | 2x 20-min weekly (minimum) | -0.53 

ELEMENTARY 

Comprehensive Models 

Writing to Read 
Collis, Ollila & Ollila (1990) | Matched (S) | 1 year | 97 students (53E, 44C) | 1 | Schools in British Columbia, Canada | SAT | 15-min daily | +0.27 
Beasley (1989) | Matched (S) | 6 months | 74 students (42E, 32C) | 1 | Middle-class students in Athens, AL; 82% W, 18% AA | SESAT-2 | 15-min daily | +0.19 

Innovative Technology Applications 

Reading Reels 
B. Chambers et al. (2006) | Randomized (L) | 1 year | 10 schools, 394 students (189E, 205C) | 1 | High-poverty schools in Hartford, CT; 61% H, 35% AA | Woodcock, DIBELS | 5-min daily | +0.17 
B. Chambers et al. (2008) | Randomized (S) | 1 year | 2 schools, 159 students (75E, 84C) | 1 | Hispanic students in high-poverty schools in Los Angeles and Las Vegas | Woodcock, GORT | 20-min daily | +0.27 

FastForWord 
Marion (2004) | Matched (L) | 1 year | 349 students (215E, 134C) | 5, 6 | Schools in Appalachian IN; 52% FL, 100% W | Terra Nova | Not stated | +0.25 
Scientific Learning (2006) | Matched (S) | 15 weeks | 142 students (55E, 87C) | 5, 6 | Middle class schools in Northwest OH | Gates MacGinitie | Not stated | +0.11 
Rouse & Krueger (2004) | Randomized (L) | 1 year | 4 schools, 454 students (237E, 217C) | 3-6 | High-poverty northeastern city schools; 59% FL, 66% H, 27% AA, 61% ELL | Connecticut Mastery Test | 90-100 min daily | +0.05 

Lightspan 
Birch (2002) | Matched post hoc (S) | 2 years | 101 students (50E, 51C) | 13 | Schools in the Caesar Rodney School District in DE | SAT | 60-min weekly (minimum) | +0.42 

Computer-Managed Learning Systems 

Accelerated Reader 
Knox (1996) | Randomized (S) | 3 months | 77 students (40E, 37C) | 3, 4 | Low SES students in a southeastern state; 72% FL, 79% W, 13% AA, 8% H | DRS & SAT | portion of a daily 60-min reading program | -0.03 
Yee (2007) | Matched (L) | 1 year | 3 schools (1E, 2C), 2072 students (612E, 1460C) | 2-5 | Majority-Hispanic schools in Los Angeles Co.; 92% FL, 79% H, 17% | CST | portion of a daily 60-min reading program | +0.06 
Nunnery & Ross (2007) | Matched (L) | 1 year | 18 schools, 912 students (450E, 462C) | 5 | 4 middle schools in a suburban Texas school district | TAAS | portion of a daily 60-min reading program | +0.22 

Supplemental CAI Programs 

Destination Reading 
Campuzano et al. (2009) | Randomized (L) | 1 year | 21 teachers (21E, 14C), 742 students (448E, 294C) | 1 | Schools across the U.S.; 71% FL, 31% AA, 34% H, 34% W | SAT-10 | 2x20-min weekly (minimum) | +0.09 
Rabiner et al. (2010) | Randomized (S) | 1 year | 5 schools, 77 students (52E, 25C) | 1 | Mostly African American and Hispanic students in the southeastern United States | WJ-III | 2x60-min weekly | +0.26 

Headsprout 
Campuzano et al. (2009) | Randomized (L) | 1 year | 63 teachers (32E, 31C), 1,079 students (574E, 505C) | 1 | Schools across the U.S.; 35% FL, 81% W, 13% AA, 67% H | SAT-10 | 3x30-min weekly | +0.01 

Plato Focus 
Campuzano et al. (2009) | Randomized (L) | 1 year | 29 teachers (15E, 14C), 618 students (327E, 291C) | 1 | Schools across the U.S.; 48% FL, 67% W, 27% H, 5% AA | SAT-10 | 15-30 min daily | +0.02 

Waterford Early Reading Program 
Campuzano et al. (2009) | Randomized (L) | 1 year | 46 teachers (28E, 18C), 1,155 students (689E, 466C) | 1 | Schools across the U.S.; 4~% FL, 37% AA, 16% H | SAT-10 | 17-30 min daily | +0.02 
Cassady & Smith (2005) | Matched (S) | 1 year | 6 classes (3E, 3C), 93 students (46E, 47C) | 1 | School in rural midwest | Terra Nova Reading | 20-min daily | +0.71 

Lexia 
Macaruso, Hook, & McCabe (2006) | Matched (S) | 7 months | 5 schools, 10 classes (5E, 5C), 179 students (92E, 87C) | 1 | Boston area; 50% FL | Gates MacGinitie | 2-4x20-30 min weekly | +0.20 

The Literacy Center (LeapFrog) 
RMC (2004) | Randomized Quasi-Experiment (S) | 1 year | 6 schools, 195 students (109E, 86C) | 1 | High-poverty schools in Las Vegas, 30% ELL | Gates MacGinitie, DIBELS | 20-30 min daily | -0.04 
Erdner, Guy, & Bush (1997) | Matched (S) | 1 year | 2 schools, 85 students (45E, 40C) | 1 | Schools in north central OK | CTBS | 3x20-min weekly | +0.75 

Reading Machine 
Abram (1984) | Randomized (S) | 12 weeks | 103 students (53E, 50C) | 1 | Not stated | ITBS | 3x15-min weekly | +0.29 

Academy of Reading 
Campuzano et al. (2009) | Randomized (L) | 1 year | 41 teachers (22E, 19C), 899 students (495E, 404C) | 4 | Schools across the U.S.; 65% FL, 54% AA, 29% H, 17% W | SAT-10 | 3x25-min weekly (minimum) | -0.01 

LeapTrack 
Campuzano et al. (2009) | Randomized (L) | 1 year | 55 teachers (29E, 26C), 1274 students (665E, 609C) | 4 | Schools across the U.S.; 61% FL, 57% AA, 33% W, 10% H | SAT-10 | 3x15-min weekly (minimum) | +0.09 






The Best Evidence Encyclopedia is a free web site created by the Johns Hopkins University School of Education ’s Center for Data-Driven Reform in Education (CDDRE) under funding from the 
Institute of Education Sciences, U.S. Department of Education. 



Study 


Design 

Large-Small 


Duration 


N 


Grade 


Sample 

C haracteristics 


Posttest 


Program 

Intercity 


Overall ES 


JostensiEarlier form of Con 


d ass Lea mint 


0 


Alifrangis(1991) 


Randomized 

(S) 


1 year 


12 dasses 
(6E, 6 C) 


4-6 


School at an army 
base near Washington, 
D.C. 37% minority


CTBS Reading 


3x20-min 

weekly


+0.15 


Becker (1994)


Randomized 

(S) 


1 year 


1 school 
187 students 


2-5 


Inner city Baltimore
High poverty. 


CAT 


3x30-min
weekly


+0.09 


Standish (1995) 


Matched (S) 


1 year 


2 schools 
139 students 

(56E, 83C)


2 


Students in suburban 
DE 


MAT 6 Reading 
Comprehension 


2x25-min
weekly


+0.05 


Estep (1997) 


Matched post 
hoc (S) 


4 years 


106 schools 

(53E, 53C) 


3 


Elementary schools in
IN 


ISTEP 


not stated 


+0.03 


Sinkis (1993) 


Matched (L) 


1 year 


422 students 
(228E, 194C)


3, 5, 6


Chapter One students 
in a large urban school 
system in the northeast 


MAT 


3x20-min
weekly


+0.12 


Compass Learning 


Kadd Research Consulting

(2006) 


Matched post 
hoc (L) 


2 years 


598 students
(159E, 439C)


4,5 


Garfield Heights, OH
50% FL, 63% W, 24%
H, 13% AA


OAT 


120-min 

Monthly 


+0.29 


CCC Successmaker 


Campbell (2000) 


Matched (L) 


1 year 


13 schools 
(7E, 6C)
701 students
(310E, 391C)


4,5 


Middle class students
in Etowah, AL


SAT 


10-15 min 
daily 


-0.02 


Ragosta (1983) 


Matched (L) 


3 years 


6 schools 
(4E, 2C)

Eight 1-year cohorts 
Three 2-year 
cohorts 

One 3-year cohort 


4-6 


High poverty schools
in Los Angeles


CTBS 


10-20 min 
daily 


+0.17 


Saracho (1982) 


Matched (L) 


1 year 


256 students 
(128E, 128C)


3-6 


Spanish-speaking
migrant students 


CTBS Reading 


180-min
weekly


-0.09 
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Study 


Design 

Large/Small 


Duration 


N 


Grade 


Sample 

Characteristics 


Posttest 


Program 

Intensity 


Overall ES


Classworks Gold


Whitaker (2005) 


Matched post 
hoc (S) 


1 year 


2 schools 
220 students 

(123E, 97C) 


4,5 


Schools in rural 
Tennessee, 62% Low 
SES. 


TCAP 


2x45-min 

weekly


-0.14 


My Reading Coach


Vaughan, Serido, & Wilhelm

(2006) 


Randomized

(L)


1 year 


4 schools 
284 students

(127E, 157C) 


24 


Predominately
minority students from
4 schools in 3 states;
27% ELL, 36% AA,
36% H, 22% W


GRADE 


34 x 45-min 
weekly 


-41.24 


WICAT


Miller (1997) 


Matched post 
hoc (L) 


3 years 


30 schools 
(10E, 20C) 


3-5 


New York City Public 
Schools, almost all 
AA or Hispanic, 16
ESL 


DRP 


15-min
daily


+0.02 


Clayton (1992) 


Matched post 
hoc (L) 


1 year 


5 schools 
(1E, 4C)
426 students
(181E, 245C)


2-5 


Schools in northwest
SC. 46% FL, 59% W,
39% AA 


CTBS 


25-min
daily


-0.01 


Open Book to Literacy 


Williams (2005) 


Matched (S) 


1 year 


2 schools 
(1E, 1C)
127 students
(66E, 61C)


4 


High-poverty schools 
in Memphis, 51% W,
24% H, 21% AA


TORC 


30-min
daily


-0.28 


Award Reading 


Block et al. (2007)


Matched (L) 


20 weeks 


1138 students
(569E, 569C)


K, 1 


High-poverty schools
in Texas, New Jersey,
and New York


Word Reading 
DIBELS


some
technology
daily


+0.11 


Lexia 


Faux (2004) 


Matched (L) 


1 year 


268 students
(137E, 131C) 


1-3 


Low achieving
students in Boston 
public schools 


DRA 


60-min
weekly


-0.07 
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Study 


Design 
Large/Small


Duration 


N 


Grade 


Sample 

Characteristics 


Posttest 


Program 

Intensity


Overall ES 


KidBiz3000


Tracey & Young (2004)


Matched (S) 


1 year 


5 schools 
168 students 
(84E, 84C)


5 


Mostly white students
in a small, northeast
city in New York


Vocabulary 

SRI 

Comprehension 


2x40-min
weekly


+0.17 


Multimedia CD-ROM 


Schardt (1997)


Randomized 

(S) 


12 weeks


96 students 
(48E, 48C)


3,4 


Hispanic LEP students 
in Tyler, Texas 


TAAS 


15-20 min 
daily 


+0.18 


Computer-assisted remedial reading instruction (CARRI)


Saine et al. (2010)


Randomized 

(S) 


28 weeks
intervention 
2 year follow 
up 


50 students 

(25E, 25C) 


1 


PI Finnish students in 
a middle-class 
suburban area 


Letter knowledge 
Reading fluency


4x45-min 

weekly 


-0.64 


Compass Learning Odyssey


DiLeo (2007) 


Randomized 

Quasi- 

Experiment 

(S) 


1 year 


4 schools
207 students 

(125E, 82C) 


5 


Mostly White students
in a low SES school
district in central PA


PSSA 


30-min 

daily


-0.38 


Read About 


James-Burdumy et al. (2009)


Randomized 

(L) 


1 


2613 students 
(1246E, 1367C) 


5 


Mostly White students 
in 10 districts across 8
states 


TOSCRF 


2x40-min 

weekly


-0.04 


ABRACADABRA 


Wolgemuth et al. (2010)


Matched (S) 


16 weeks 


166 students 
(118E, 48C)


1,2 


Students from 
Northern Territory
Indigenous classrooms 
in Australia 


GRADE 


4x30-min 

weekly


+0.10 


Wolgemuth et al. (2010)


Randomized 

(L) 


1 semester 


17 classes
308 students 
(163E, 145C) 


K to Year 2 


Students from six 
schools in three 
Northern Territory
cities in Australia:
Alice Springs,
Darwin, Palmerston


GRADE 


4x30-25 min 
weekly


-0.22 
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Study 


Design 

Large/Small


Duration 


N 


Grade 


Sample 

Characteristics 


Posttest 


Program 

Intensity 


Overall ES 


Savage et al. (2010)


Randomized 

(L) 


1 


74 classrooms 
1067 students 
(549E, 518C)


K to Year 2


23 non-denominational
inner city and 
suburban schools from 
three Canadian 
provinces 


Letter Sounds 
Blending Words 
Listening 
Comprehension 


180-min

weekly 


+0.21 


Other Supplemental CAI 


Dynarski et al. (2007):

Destination Reading

Waterford

Headsprout

Plato Focus

Academy of Reading


Randomized 

(L) 


1 year 


2619 students 
(1516E, 1103C)


1 


National.
49% FL, 44% W,
31% AA, 22% H


SAT-9 


About 20-min
daily 


+0.04 


Dynarski et al. (2007):

Destination Reading

Waterford

Headsprout

Plato Focus

Academy of Reading


Randomized 

(L) 


1 year 


2265 students 
(1231E, 1034C)


4 


National 
64% FL, 17% W,
57% AA, 23% H


SAT-9 


About 20-min
daily 


+0.02 


Ramey (1991) 


Matched (L) 


1 year 


282 students 
(62E, 220C)


2-5 


Urban Washington 
State 


CAT-Reading


Not stated 


+0.22 


Bass, Ries, & Sharpe (1986)


Matched (S) 


1 year 


2 schools 
(1E, 1C)
145 students
(73E, 72C)


5,6 


High-poverty schools
in rural VA


SRA 

Virginia Basic 
Learning Skills 


30-min
weekly


+0.18


Easterling (1982)
(Micro System 80)


Randomized 

(S) 


4 months 


2 schools 
42 students 

(21E, 21C)


5 


Schools in suburban 
school district 


CAT Reading 
Comprehension 


2x15-min

weekly 


+0.01 


Schmidt (1991) 
(Wasatch ILS)


Matched (L) 


1 year 


4 schools 
(2E,2C) 
1,224 students 
(646E, 578C) 


2-6 


Schools in Southern 
CA. 25% FL


CTBS 


20-min
daily 


+0.04 


Cooperman (1985)


Matched (L) 


1 year 


3 schools 
(1E, 2C)
470 students
(204E, 266C)


2-4 


Students from 3 low to 
middle class schools. 
86% W, 13% AA 


CAT 


10-min 

daily 


-0.06 


Bryg (1984)


Matched (S)


15 weeks 


9 teachers 
(5E, 4C) 
152 students 
(83E, 69C)


4 


Large urban schools in 
Omaha, NE 


CAT Reading 
Comprehension 


not stated 


+0.20 
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Srudy 


Design 

Large/Small


Duration 


N 


Grade 


Sample 

Characteristics 


Posttest 


Program 

Intensity 


Overall ES


Roth & Beck (1987)


Matched (S) 


1 year 


6 classes
(3E, 3C) 
108 students 
(59E, 49C) 


4 


High-poverty low-achieving
urban schools. 100% AA.


Woodcock Word 
Attack & CAT 


3x20-min 

weekly 


+0.38 


Coomes (1985)


Matched (S) 


1 year 


4 schools 
102 students 
(51E, 51C)


4 


Middle class schools 
in TX. 90% W.


CTBS 


30-min 

weekly 


+0.02 


Hoffman (1984) 


Matched (S) 


1 year 


3 schools
96 students
(51E, 45C)


3 


Schools in suburban 
midwest 11% 
minority 


Gates MacGinitie 


10-min
daily 


-0.07 


Levy (1985) 


Matched post 
hoc (L) 


1 year 


4 schools 
581 students 
(293E, 288C)


5 


Suburban NY school
district 


SAT 


3x20-min 

weekly 


+0.19 


SECONDARY


Comprehensive Models 


Read 180 


White, Haslam, & Hewes
(2006) 


Matched (L) 


1.5 years 


1652 students 
(826E, 826C)


9, 10 


Students with low 
reading scores in 
Phoenix, AZ 


SAT-9 

AIMS 


90-min daily 
(20-min CAI)


+0.12 


1 semester 


1630 students 
(815E, 815C)


9 


Papalewis (2004)


Matched (L) 


1 year 


1073 students 
(537E, 536C)


8 (mostly),
retained 


Low-performing 
students in Los 
Angeles


SAT-9 


90-min daily 
(20-min CAI) 


+0.68


Mims, Lowther, Strahl, &
Nunnery (2006)


Matched (L) 


1 year 


1000 students 


6-9 


Mostly African 
American students in
Little Rock, AR 


ITBS 


90-min daily 
(20-min CAI) 


-0.12 


Interactive, Inc (2002) 


Matched (L) 


1 year 


800 students 

(387E, 323C) 


6-8


Two middle schools
from each of Boston,
Houston, Dallas, and
Columbus


SAT-9 


90-min daily 
(20-min CAI) 


+0.24 


Haslam, White, & Klinge 
(2006) 


Matched (L) 


1 year 


614 students 

(307E, 307C)


7,8 


Low performing
students in Austin, TX 


TAKS 


90-min daily 
(20-min CAI) 


+0.18 
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Study 


Design 

Large/Small 


Duration 


N 


Grade 


Sample 

Characteristics 


Posttest 


Program 

Intensity 


Overall ES 


Woods (2007) 


Matched (L) 


1 year 


268 students
(134E, 134C) 


6-8 


Low-performing
mostly African 
American students in 
southeastern Virginia 


DRP & STAR 


90-min every
other day


+0.43


Caggiano (2007)


Matched (S) 


1 year 


120 students 
(60E, 60C) 


6-8 


Low-performing
mostly African
American students in
southeastern Virginia


Virginia SOL 


90-min every
other day


+0.01


Nave (2007) 


Matched post
hoc (S) 


1 year 


110 students
(80E, 30C) 


7 


At-risk students in 
Sevier County, TN


TCAP 


90-min every 
other day


+1.58


Scholastic Research (2008)


Matched (L)


1 year 


570 students 
(285E, 285C) 


6,7,9 


Mostly ELL students 
in the Desert Sands 
Unified School 
District in CA


CST_ELA 


90-min
(20-min CAI)


+0.14


Lang et al. (2010)


Randomized 

(L) 


1 year 


599 students 

(307E, 292C)


9 


Struggling readers in a 
low SES school 
district 


FCAT 


90-min daily 
(20-min CAI)


+0.04


Voyager Passport


Shneyderman (2006)


Matched (L)


1 year 


8 schools
(4E, 4C)
847 students
(453E, 394C)


9, 10 


Mostly Hispanic ESL
students in Miami, FL 


FCAT 


Not stated 


+0.17


Denson (2008) 


Matched (S)


1 year 


1 school 
182 students 
(123E, 59C) 


9 


Mostly Hispanic 
students in a low SES 
urban high school 


TAKS 


Not stated 


+0.38


Innovative Technology Applications


Carry-a-Tune (CAT)


Biggs et al. (2008)


Matched (S)


16 weeks 


1 school 
46 students 

(24E, 22C)


7,8 


Mostly White students
in a low SES rural 
middle school in 
Florida 


Qualitative 
Reading Inventory


3x30-min 

weekly 


+1.02


Computer-Managed Learning Systems


Accelerated Reader 


Nunnery & Ross (2007) 


Matched (L) 


1 year 


4 schools 
848 students 
(400E, 448C)


8


4 middle schools in a 
suburban Texas school
district 


TAAS 


portion of a
daily 60-min
reading
program


+0.38
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Study 


Design 

Large/Small 


Duration 


N 


Grade 


Sample 

Characteristics


Posttest 


Program 

Intensity 


Overall ES


Supplemental CAI Programs


Jostens 


Hunter (1994) 


Matched (L) 


28 weeks


6 schools 
(3E, 3C)
270 students
(135E, 135C)


6-8 


Schools in rural 
Jefferson County,
Georgia 


ITBS 


30-min
daily 


+0.31 


Computer Curriculum Corporation 


Liston (1991) 


Matched post- 
hoc (L) 


1 year 


49 schools 
(26E, 23C)
4597 students
(2,288E, 2,309C)
in 2 cohorts 


10 


Remedial students in 
South Carolina; 72% 
African American and 
28% White


South Carolina
Exit Exam 


Not stated 


+0.06 


Other Supplemental CAI


Chiang, Stauffer, and Cannara

(1978) 


Matched (S) 


1 year 


8 schools 
(4E, 4C)
168 students
(99E, 69C)


Junior high
school 


Special education 
students in Cupertino,
CA 


PIAT


33-min
weekly 


+0.14 


Metrics Associates (1981)


Matched (S) 


1 year 


105 students 

(70E, 35C)


7-9 


Two Massachusetts 
school districts 


MAT reading 


10-min 

daily 


+0.56 
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